
Difference Between Fixed-Effects and Random-Effects Models in Meta-Analysis?

First, we need to understand the differences between fixed-effects and random-effects models. Both are methods used to combine the results of individual studies into an overall summary estimate.
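As a rough illustration of how the two models combine study results, here is a minimal sketch of inverse-variance pooling. It assumes the standard weighting scheme: the fixed-effect model weights each study by 1 / variance, while the random-effects model adds the between-study variance (Tau2) to every study's variance before weighting. The function name `pool`, the data, and the Tau2 value are all hypothetical and chosen only for illustration.

```python
import math

def pool(effects, variances, tau2=0.0):
    """Inverse-variance pooled estimate and its standard error.

    tau2 = 0 corresponds to the fixed-effect model; tau2 > 0 adds the
    between-study variance to each study's variance (random-effects model).
    """
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)

# Hypothetical log risk ratios and within-study variances (illustration only).
log_rr = [-0.60, -0.10, -0.45, 0.05]
var = [0.04, 0.01, 0.09, 0.02]

fixed_est, fixed_se = pool(log_rr, var)                # fixed-effect model
random_est, random_se = pool(log_rr, var, tau2=0.05)   # Tau2 assumed, not estimated

print(f"Fixed:  {math.exp(fixed_est):.2f} (SE {fixed_se:.3f})")
print(f"Random: {math.exp(random_est):.2f} (SE {random_se:.3f})")
```

Because the random-effects weights are smaller and more equal across studies, the random-effects standard error is never smaller than the fixed-effect one for the same data.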


Sources of Error:


In Summary:


Which Method to Choose?


Types of Heterogeneity?

Before we start discussing heterogeneity measures, we should first clarify that heterogeneity can mean different things.

  1. Baseline (or design-related) heterogeneity:

    This type of heterogeneity occurs when the populations, interventions, comparators, or outcomes differ across the included studies. Such variability stems from differences in how studies are conducted, their settings, or the characteristics of the participants.

    For instance:

    • A meta-analysis including studies on various age groups, disease severities, or clinical settings might exhibit baseline heterogeneity.

    • Differences in methodologies, such as trial follow-up, dosages, or measurement tools, can also contribute.

    • Addressing baseline heterogeneity starts with a well-defined research question and a priori inclusion criteria, often guided by the PICOT framework. By clearly defining the population, intervention, comparator, and outcomes at the outset, researchers can minimize variability introduced by baseline differences.

  2. Statistical Heterogeneity:

    Statistical heterogeneity quantifies the variability in effect size estimates across studies included in the meta-analysis. It reflects how much the observed effects differ from what would be expected under a single fixed effect size.

    • Baseline heterogeneity may lead to statistical heterogeneity when differences in populations or study designs result in varying treatment effects across studies.

    • Baseline heterogeneity does not always translate to statistical heterogeneity - Studies with different designs may still produce consistent effect sizes, leading to low statistical heterogeneity.

    • Statistical heterogeneity can exist even without baseline heterogeneity - For example, a meta-analysis of very similar studies may show high variability in effect sizes due to random sampling error or small sample sizes.


Measures of Heterogeneity?

These four measures are related to heterogeneity in meta-analysis, but they serve different purposes. Here’s an easy way to understand them:

  1. Q ( Chi2 ) – “Is There Heterogeneity?”

    • What it does: Tests if there is more variability in the study results than expected by chance.

    • Practical use: Answers the question, “Is heterogeneity present?”

    • How to interpret:

      • Low Q (Chi2) and non-significant (p > 0.10): suggests that the variability is likely due to chance.

      • High Q (Chi2) and significant (p ≤ 0.10): suggests that the variability is unlikely to be due to chance alone, i.e., heterogeneity is present.

  2. Tau2 – “How Much Heterogeneity?”

    • What it does: Measures the absolute amount of variance in the true effect sizes across studies.

    • Practical use: Quantifies how much the effect sizes differ between studies.

    • How to interpret:

      • Tau2 = 0: No variability between studies (all studies estimate the same effect size).

      • Larger Tau2: Greater variability in the true effects.

  3. I2 - “What Percentage of the Variability is Due to Heterogeneity?”

    • What it does: Indicates the proportion of total variability in the effect sizes that is due to heterogeneity (not chance).

    • Practical use: Helps understand how much of the variability is caused by true differences between studies versus random sampling error.

    • How to interpret:

      • I2 = 0: No heterogeneity (all variability is due to chance).

      • I2 ≤ 25%: low heterogeneity.

      • I2 = 26-50%: moderate heterogeneity.

      • I2 > 50%: substantial heterogeneity.

  4. Prediction Interval (PI)

    • What it does: Provides a range within which the true effect of a future study is expected to fall, accounting for both within-study sampling error and between-study variability (Tau2). It complements Tau2 and I2, which describe variability but do not predict future outcomes. Unlike confidence intervals (CIs), which estimate the precision of the pooled effect, prediction intervals (PIs) reflect the variability in potential real-world applications.

    • Practical use: Adds clinical relevance by showing how results might vary in new settings. The prediction interval reflects the variation in true treatment effects across different settings, including the effect to be expected in future patients, such as those a clinician is interested in treating. Therefore, it should be routinely reported in addition to the summary effect and its CI, and used as a main tool for interpreting evidence, to enable more informed clinical decision-making.

    • How to interpret:

      • Narrow PI: Suggests consistent effects across studies. Typical of meta-analyses based on many studies with low estimated heterogeneity.

      • Wide PI: Indicates high uncertainty or substantial heterogeneity. Typical of meta-analyses with few studies and substantial heterogeneity.

      • PI Including the Opposite Effect:

        • A PI that spans the null effect (e.g., OR = 1.0) suggests uncertainty about the treatment’s effectiveness.

        • If the PI contains the exact opposite effect (e.g., pooled OR = 2.0, PI includes OR = 0.5), it indicates substantial heterogeneity, making the pooled estimate less reliable.

      • Meta-Analyses with I2 = 0: A substantial proportion of meta-analyses have an estimated I2 of 0. However, there is typically very large uncertainty about the exact amount of heterogeneity, as shown by the very wide 95% CIs for I2. So, when I2 = 0, rely on the prediction interval to assess variability and generalizability across studies.
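To make the four measures concrete, here is a minimal sketch that computes them from study-level effects and variances. It rests on several assumptions not stated above: Tau2 is estimated with the DerSimonian-Laird method (one of several estimators), I2 is computed as (Q − df) / Q, and the 95% prediction interval uses the common formula pooled effect ± t × sqrt(Tau2 + SE²) with k − 2 degrees of freedom. The function name `heterogeneity` and the data are hypothetical. The p-value for Q is omitted because it needs a chi-square distribution function, which a statistics library would normally supply.

```python
import math

def heterogeneity(effects, variances, t_crit):
    """Q, DerSimonian-Laird Tau2, I2 (%), and a 95% prediction interval.

    t_crit is the 0.975 quantile of a t distribution with k - 2 degrees
    of freedom, taken from a table or a statistics library.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    # Tau2 (DerSimonian-Laird), truncated at zero
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)
    # I2: share of total variability beyond what chance would produce
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects pooled estimate and its standard error
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    # Prediction interval on the analysis scale (e.g. log RR)
    half = t_crit * math.sqrt(tau2 + se ** 2)
    return {"Q": q, "df": df, "tau2": tau2, "I2": i2,
            "pooled": pooled, "PI": (pooled - half, pooled + half)}

# Hypothetical log risk ratios and variances for five studies.
log_rr = [-0.60, -0.10, -0.45, 0.05, -0.30]
var = [0.04, 0.01, 0.09, 0.02, 0.03]
res = heterogeneity(log_rr, var, t_crit=3.182)   # t quantile for 5 - 2 = 3 df

print(f"Q = {res['Q']:.2f} on {res['df']} df")
print(f"Tau2 = {res['tau2']:.3f}, I2 = {res['I2']:.0f}%")
print("PI (RR scale):", tuple(round(math.exp(x), 2) for x in res["PI"]))
```

For ratio measures the interval is computed on the log scale and exponentiated at the end, which is why the example works with log risk ratios.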


Summary

Measure | What it does | Question it answers | Interpretation
--- | --- | --- | ---
Q (Chi2) | Tests if heterogeneity is present. | “Is there heterogeneity?” | Significant Q = heterogeneity exists.
Tau2 | Quantifies the absolute heterogeneity (absolute measure of heterogeneity). | “How much heterogeneity is there?” | Larger Tau2 = more between-study variability.
I2 | Proportion of variability due to heterogeneity (relative measure of heterogeneity). | “What percentage of variability is due to heterogeneity?” | Higher I2 = more heterogeneity.
Prediction Interval | Provides a range for true effects in future studies. | Adds clinical relevance by predicting outcomes in new studies. | Wide PI = greater uncertainty; narrow PI = consistent effects.

In sum, it is advisable not to rely on a single measure when characterizing the heterogeneity of a meta-analysis. It is recommended to always report at least I2 (with confidence intervals) as well as prediction intervals, and to interpret the results accordingly.

Example 1:


Interpreting the results:

First, Is there heterogeneity? - We see that Q (Chi2), the test of heterogeneity, is 49.46. This is far more than we would expect from the 13 degrees of freedom (df) in this analysis (number of studies − 1). Consequently, the heterogeneity test is significant (p < 0.01).

Second, How much heterogeneity is there? - We see that Tau2 = 0.11, indicating that some between-study heterogeneity exists in our data.

Third, What percentage of variability is due to heterogeneity (not chance)? - We see that I2 = 74% (95% CI 55% - 85%). Using Higgins and Thompson’s “rule of thumb”, we can characterize this amount of heterogeneity as substantial.
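As a quick check, the reported I2 can be recovered from Q and the degrees of freedom alone. The sketch below assumes the usual definition I2 = (Q − df) / Q × 100% and takes the number of studies as 14, which follows from the 13 degrees of freedom.

```python
q = 49.46   # Q statistic reported in Example 1
k = 14      # number of studies, so df = k - 1 = 13
df = k - 1

# I2: share of the total variability that exceeds what chance would produce
i2 = max(0.0, (q - df) / q) * 100
print(f"I2 = {i2:.0f}%")   # prints "I2 = 74%"
```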


Here is how we could report the amount of heterogeneity we found in our example 1:

“Experimental group was associated with lower rates of all-cause mortality (RR 0.73; 95% CI 0.57 - 0.93; p = 0.01; I2 = 74%) compared with control group. The prediction interval (PI) ranged from 0.33 to 1.58 indicating that negative intervention effects cannot be ruled out for future studies.”
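The reported prediction interval can be approximately reconstructed from the numbers in Example 1. This is only a sketch under assumptions: it uses the common formula pooled effect ± t × sqrt(Tau2 + SE²) on the log scale with k − 2 = 12 degrees of freedom, recovers the standard error from the reported 95% CI assuming a normal-based (Wald-type) interval, and takes the t quantile (2.179) from a standard table. The software behind the original analysis may differ in its details.

```python
import math

rr, ci_low, ci_high = 0.73, 0.57, 0.93   # pooled RR and 95% CI from Example 1
tau2 = 0.11                               # between-study variance from Example 1
t_crit = 2.179                            # 0.975 t quantile, 12 df (table value)

log_rr = math.log(rr)
# Standard error of the pooled log RR, backed out of the CI (assumes a Wald-type CI)
se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# Prediction interval on the log scale, then back-transformed to the RR scale
half_width = t_crit * math.sqrt(tau2 + se ** 2)
pi_low = math.exp(log_rr - half_width)
pi_high = math.exp(log_rr + half_width)
print(f"PI: {pi_low:.2f} to {pi_high:.2f}")
```

The result is roughly 0.34 to 1.58, close to the reported 0.33 to 1.58. The small difference at the lower bound is plausibly due to rounding of the published inputs.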


References

  1. Higgins JPT, Thompson SG. Quantifying heterogeneity in a meta-analysis. Statistics in Medicine. 2002;21(11):1539-1558.

  2. IntHout J, Ioannidis JPA, Rovers MM, et al. Plea for routinely presenting prediction intervals in meta-analysis. BMJ Open. 2016;6:e010247.

  3. Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.5 (updated August 2024). Cochrane, 2024. Available from www.training.cochrane.org/handbook.
